Introduction

Data source: Prosper loan data

I chose to explore the “Loan data from Prosper” dataset from Prosper.com. Prosper is a peer-to-peer lending marketplace. Borrowers make loan requests and investors contribute to the loans of their choice. Once the process is complete, borrowers make fixed monthly payments and investors receive a portion of those payments directly in their Prosper account. I am interested in exploring which factors benefit borrowers and lenders.


Overview of the data

## [1] 113937     81
## Classes 'tbl_df', 'tbl' and 'data.frame':    113937 obs. of  81 variables:
##  $ ListingKey                         : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
##  $ ListingNumber                      : int  193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
##  $ ListingCreationDate                : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
##  $ CreditGrade                        : Factor w/ 9 levels "","A","AA","B",..: 5 1 8 1 1 1 1 1 1 1 ...
##  $ Term                               : int  36 36 36 36 36 60 36 36 36 36 ...
##  $ LoanStatus                         : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
##  $ ClosedDate                         : Factor w/ 2803 levels "","2005-11-25 00:00:00",..: 1138 1 1263 1 1 1 1 1 1 1 ...
##  $ BorrowerAPR                        : num  0.165 0.12 0.283 0.125 0.246 ...
##  $ BorrowerRate                       : num  0.158 0.092 0.275 0.0974 0.2085 ...
##  $ LenderYield                        : num  0.138 0.082 0.24 0.0874 0.1985 ...
##  $ EstimatedEffectiveYield            : num  NA 0.0796 NA 0.0849 0.1832 ...
##  $ EstimatedLoss                      : num  NA 0.0249 NA 0.0249 0.0925 ...
##  $ EstimatedReturn                    : num  NA 0.0547 NA 0.06 0.0907 ...
##  $ ProsperRating..numeric.            : int  NA 6 NA 6 3 5 2 4 7 7 ...
##  $ ProsperRating..Alpha.              : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...
##  $ ProsperScore                       : num  NA 7 NA 9 4 10 2 4 9 11 ...
##  $ ListingCategory..numeric.          : int  0 2 0 16 2 1 1 2 7 7 ...
##  $ BorrowerState                      : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
##  $ Occupation                         : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
##  $ EmploymentStatus                   : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
##  $ EmploymentStatusDuration           : int  2 44 NA 113 44 82 172 103 269 269 ...
##  $ IsBorrowerHomeowner                : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
##  $ CurrentlyInGroup                   : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
##  $ GroupKey                           : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
##  $ DateCreditPulled                   : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
##  $ CreditScoreRangeLower              : int  640 680 480 800 680 740 680 700 820 820 ...
##  $ CreditScoreRangeUpper              : int  659 699 499 819 699 759 699 719 839 839 ...
##  $ FirstRecordedCreditLine            : Factor w/ 11586 levels "","1947-08-24 00:00:00",..: 8639 6617 8927 2247 9498 497 8265 7685 5543 5543 ...
##  $ CurrentCreditLines                 : int  5 14 NA 5 19 21 10 6 17 17 ...
##  $ OpenCreditLines                    : int  4 14 NA 5 19 17 7 6 16 16 ...
##  $ TotalCreditLinespast7years         : int  12 29 3 29 49 49 20 10 32 32 ...
##  $ OpenRevolvingAccounts              : int  1 13 0 7 6 13 6 5 12 12 ...
##  $ OpenRevolvingMonthlyPayment        : num  24 389 0 115 220 1410 214 101 219 219 ...
##  $ InquiriesLast6Months               : int  3 3 0 0 1 0 0 3 1 1 ...
##  $ TotalInquiries                     : num  3 5 1 1 9 2 0 16 6 6 ...
##  $ CurrentDelinquencies               : int  2 0 1 4 0 0 0 0 0 0 ...
##  $ AmountDelinquent                   : num  472 0 NA 10056 0 ...
##  $ DelinquenciesLast7Years            : int  4 0 0 14 0 0 0 0 0 0 ...
##  $ PublicRecordsLast10Years           : int  0 1 0 0 0 0 0 1 0 0 ...
##  $ PublicRecordsLast12Months          : int  0 0 NA 0 0 0 0 0 0 0 ...
##  $ RevolvingCreditBalance             : num  0 3989 NA 1444 6193 ...
##  $ BankcardUtilization                : num  0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
##  $ AvailableBankcardCredit            : num  1500 10266 NA 30754 695 ...
##  $ TotalTrades                        : num  11 29 NA 26 39 47 16 10 29 29 ...
##  $ TradesNeverDelinquent..percentage. : num  0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
##  $ TradesOpenedLast6Months            : num  0 2 NA 0 2 0 0 0 1 1 ...
##  $ DebtToIncomeRatio                  : num  0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
##  $ IncomeRange                        : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
##  $ IncomeVerifiable                   : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
##  $ StatedMonthlyIncome                : num  3083 6125 2083 2875 9583 ...
##  $ LoanKey                            : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
##  $ TotalProsperLoans                  : int  NA NA NA NA 1 NA NA NA NA NA ...
##  $ TotalProsperPaymentsBilled         : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ OnTimeProsperPayments              : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ ProsperPaymentsLessThanOneMonthLate: int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPaymentsOneMonthPlusLate    : int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPrincipalBorrowed           : num  NA NA NA NA 11000 NA NA NA NA NA ...
##  $ ProsperPrincipalOutstanding        : num  NA NA NA NA 9948 ...
##  $ ScorexChangeAtTimeOfListing        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanCurrentDaysDelinquent          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ LoanFirstDefaultedCycleNumber      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanMonthsSinceOrigination         : int  78 0 86 16 6 3 11 10 3 3 ...
##  $ LoanNumber                         : int  19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
##  $ LoanOriginalAmount                 : int  9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
##  $ LoanOriginationDate                : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
##  $ LoanOriginationQuarter             : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
##  $ MemberKey                          : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
##  $ MonthlyLoanPayment                 : num  330 319 123 321 564 ...
##  $ LP_CustomerPayments                : num  11396 0 4187 5143 2820 ...
##  $ LP_CustomerPrincipalPayments       : num  9425 0 3001 4091 1563 ...
##  $ LP_InterestandFees                 : num  1971 0 1186 1052 1257 ...
##  $ LP_ServiceFees                     : num  -133.2 0 -24.2 -108 -60.3 ...
##  $ LP_CollectionFees                  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_GrossPrincipalLoss              : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NetPrincipalLoss                : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NonPrincipalRecoverypayments    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PercentFunded                      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ Recommendations                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsCount         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsAmount        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Investors                          : int  258 1 41 158 20 1 1 1 1 1 ...

Univariate Plots Section

I will begin by exploring the trend of loans on prosper.com since origination.

Time period of loans

Prosper loans fall into two periods: the first from 2005/Q4 to 2008/Q4, and the second from 2009/Q2 to 2014/Q1. In 2009/Q1 Prosper entered a ‘quiet period’ while seeking regulatory approval from the SEC. Loans show a rising trend year over year.
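Plotting loans by quarter requires the quarter labels to sort chronologically, but labels in the "Qn YYYY" form of LoanOriginationQuarter sort alphabetically by default. The following is a minimal sketch of one way to reorder them in base R; the sample labels and the variable names are made up for illustration and are not taken from the original analysis.

```r
# Hypothetical sample of quarter labels in the "Qn YYYY" form used by
# LoanOriginationQuarter (values are made up for illustration).
quarters <- c("Q1 2014", "Q4 2005", "Q2 2009", "Q4 2008", "Q2 2009")

# Split each label into its quarter number and its year.
q_num <- as.integer(substr(quarters, 2, 2))
q_year <- as.integer(substr(quarters, 4, 7))

# Order the labels by year, then by quarter, and keep each label once.
chrono_levels <- unique(quarters[order(q_year, q_num)])

# An ordered factor keeps tables and plots in chronological order.
quarter_f <- factor(quarters, levels = chrono_levels, ordered = TRUE)
print(table(quarter_f))
```

With the factor levels set this way, a bar chart or table of loans per quarter should read left to right in time order.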

Loan Category

Take a look at the purposes of the loans.

##      Not Available Debt Consolidation   Home Improvement 
##              58308               7433               7189 
##           Business      Personal Loan        Student Use 
##               2395                756               2572 
##               Auto              Other      Baby&Adoption 
##              10494                199                 85 
##               Boat Cosmetic Procedure    Engagement Ring 
##                 91                217                 59 
##        Green Loans Household Expenses    Large Purchases 
##               1996                876               1522 
##     Medical/Dental         Motorcycle                 RV 
##                304                 52                885 
##              Taxes           Vacation      Wedding Loans 
##                768                771                  0 
##               NA's 
##              16965

Loans marked “Not Available” account for nearly 60k listings on prosper.com. Excluding “Not Available”, the top three loan types are “Auto”, “Debt Consolidation”, and “Home Improvement”, in that order.

The Interest and return

Next, look at the borrower’s interest rate on the loan.

BorrowerRate is approximately normally distributed, with a mean of 0.19 and a median of 0.18. EstimatedEffectiveYield is also approximately normally distributed and looks nearly identical to BorrowerRate.

Summary of BorrowerRate

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.1340  0.1840  0.1928  0.2500  0.4975

Summary of LenderYield

How much yield lenders receive on the loans they fund.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.0100  0.1242  0.1730  0.1827  0.2400  0.4925

Term of loan

Which terms are available for loans?

##    12    36    60 
##  1614 87778 24545

Loans have terms of 12, 36, or 60 months, and the 36-month term is the most popular choice.

Amount Of Loan

Check the loan amounts on prosper.com.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6500    8337   12000   35000

Loan amounts range from $1,000 (the minimum loan) to $35,000, and 75% of loans are for $12,000 or less.

Borrowers Income

Take a look at the income range of Prosper borrowers.

Summary of Income range

##  Not displayed   Not employed             $0      $1-24,999 $25,000-49,999 
##           7741            806            621           7274          32192 
## $50,000-74,999 $75,000-99,999      $100,000+ 
##          31050          16916          17337

Most borrowers have incomes ranging from $25,000 to $75,000; relatively few have incomes below $25,000.

Summary of Monthly Income

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    3200    4667    5608    6825 1750000

75% of borrowers have a monthly income of $6,825 or less, but the maximum is $1,750,000, which does not seem plausible.

DebtToIncomeRatio

Check the ratio of debt to income for borrowers.
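An extreme value like this maximum stretches the axis of any histogram of monthly income. The sketch below shows two common ways to handle it in base R: trimming above a high quantile and taking a log transform. The income values, the 99th-percentile cutoff, and the variable names are all assumptions chosen for illustration, not the author's actual steps.

```r
# Made-up monthly incomes: 99 ordinary values plus one implausible outlier.
income <- c(seq(1000, 10800, by = 100), 1750000)

# Option 1: keep only values at or below the 99th percentile.
# The 0.99 cutoff is an arbitrary choice for this sketch.
cutoff <- quantile(income, probs = 0.99)
trimmed <- income[income <= cutoff]

# Option 2: a log10 transform compresses the long right tail.
# (Real data containing zeros would need log10(income + 1) or similar.)
log_income <- log10(income)

print(summary(trimmed))
print(summary(log_income))
```

Trimming is convenient for plotting, while the log transform keeps every observation; which one is appropriate depends on whether the extreme values are believed to be errors.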

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   0.140   0.220   0.276   0.320  10.010    8554

Half of the borrowers have a debt-to-income ratio of 22% or less (the median), and 75% of the borrowers have debt of less than 32% of their income.

Prosper Credit Rating

Investigate the rating of borrowers.

##    AA     A     B     C     D     E    HR    NC    NA  NA's 
##  5372 14551 15581 18345 14274  9795  6935     0     0 29084

Ignoring missing values, most borrowers are rated C, B, A, or D under the Prosper Rating.

Now look at investor information.

Number of Investors

How many investors funded each loan on Prosper?

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    2.00   44.00   80.48  115.00 1189.00

The number of investors is approximately lognormally distributed; a few loans have a very large number of investors.

Estimated Yield

Check the yield for investors.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.005   0.042   0.072   0.080   0.112   0.366   29084

Summary of Loss yield

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.005   0.042   0.072   0.080   0.112   0.366   29084

Summary of Return yield

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.183   0.074   0.092   0.096   0.117   0.284   29084

The estimated loss and estimated return for investors look nearly identical, at about 12% and 11%.


Univariate Analysis

What is the structure of your dataset?

The dataset contains 81 variables on 113,937 loans made through the prosper.com marketplace. The loans cover the period from 2005-11-15 to 2014-03-12. Variables are of classes int, numeric, date, and factor.

What is/are the main feature(s) of interest in your dataset?

The main features of interest for me are the lender yield and the borrower rate. I want to know which factors influence LenderYield and BorrowerRate.

What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

I think ProsperRating, CreditScore, IncomeRange, Term, DebtToIncomeRatio, and LoanCategory may help in investigating LenderYield and BorrowerRate.

Did you create any new variables from existing variables in the dataset?

I created a single CreditRange variable that represents the average of CreditScoreRangeUpper and CreditScoreRangeLower.
I also created factor variables from ListingCategory and from the term in months.
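As a rough sketch of how such derived variables can be built in base R, the example below averages the two credit score bounds and turns the term into a factor. The four rows reuse the first values shown in the data overview; the column names CreditScoreRange and TermFactor are assumptions for illustration and may differ from the names used in the original analysis.

```r
# First four rows of the relevant columns, as shown in the data overview.
loans <- data.frame(
  CreditScoreRangeLower = c(640, 680, 480, 800),
  CreditScoreRangeUpper = c(659, 699, 499, 819),
  Term = c(36, 36, 36, 36)
)

# Single credit score: the midpoint of the lower and upper bounds.
loans$CreditScoreRange <-
  (loans$CreditScoreRangeLower + loans$CreditScoreRangeUpper) / 2

# Term as a factor with the three terms present in the dataset.
loans$TermFactor <- factor(loans$Term, levels = c(12, 36, 60))

print(loans)
```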

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

The levels of ListingCategory and IncomeRange were unordered, so I reordered them for better plots. I transformed StatedMonthlyIncome and DebtToIncomeRatio, which have long tails with large values that appear to be outliers.


Bivariate Plots Section

Next, explore the correlations between features in the dataset.

Loan Category since Origination

Start with the trend of loan categories since loan origination.

From 2007 Prosper offered seven loan categories; in 2011 it added 13 more, for a total of 20 categories available now.

Term and Amount

Small loans tend to have short terms, and large loans tend to have long terms.

Loan Category and Amount

Excluding “Not Available” and “Other”, loans for Home Improvement and Baby&Adoption account for the largest amounts on Prosper.

Loan Status and Amount

## Source: local data frame [12 x 3]
## 
##                     group       sum ratio
##                    (fctr)     (int) (dbl)
## 1               Cancelled      8500  0.00
## 2              Chargedoff  76735809  8.08
## 3               Completed 235643536 24.81
## 4                 Current 586174602 61.71
## 5               Defaulted  32550755  3.43
## 6  FinalPaymentInProgress   1710955  0.18
## 7    Past Due (>120 days)    132500  0.01
## 8    Past Due (1-15 days)   6825567  0.72
## 9   Past Due (16-30 days)   2161454  0.23
## 10  Past Due (31-60 days)   3097964  0.33
## 11  Past Due (61-90 days)   2419496  0.25
## 12 Past Due (91-120 days)   2433209  0.26

Loan status looks good overall: past-due loans make up less than 2% of the total amount.
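The past-due share can be checked directly from the table above. The sketch below copies the amounts from that table into a named vector and recomputes the percentages; the variable names are illustrative only.

```r
# Loan amounts by status, copied from the summary table above.
status_sum <- c(
  "Cancelled" = 8500,
  "Chargedoff" = 76735809,
  "Completed" = 235643536,
  "Current" = 586174602,
  "Defaulted" = 32550755,
  "FinalPaymentInProgress" = 1710955,
  "Past Due (>120 days)" = 132500,
  "Past Due (1-15 days)" = 6825567,
  "Past Due (16-30 days)" = 2161454,
  "Past Due (31-60 days)" = 3097964,
  "Past Due (61-90 days)" = 2419496,
  "Past Due (91-120 days)" = 2433209
)

# Percentage of the total amount in each status.
ratio <- round(100 * status_sum / sum(status_sum), 2)

# Combined share of all "Past Due" buckets.
is_past_due <- grepl("^Past Due", names(status_sum))
past_due_share <- 100 * sum(status_sum[is_past_due]) / sum(status_sum)

print(ratio)
print(round(past_due_share, 2))
```

The combined past-due share comes out at about 1.8%, which agrees with the sum of the six past-due rows in the table.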

IncomeRange and LoanAmount

## 
##  Pearson's product-moment correlation
## 
## data:  pdf$StatedMonthlyIncome and pdf$LoanOriginalAmount
## t = 69.353, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1956816 0.2068243
## sample estimates:
##       cor 
## 0.2012595

Higher-income borrowers tend to take out larger loans, although the correlation is weak (about 0.20); this may be related to their ability to repay.

Home owner and Credit Score

Credit scores appear to be higher for homeowners than for borrowers who do not own a home.
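Output of the kind shown above comes from a Pearson correlation test. The sketch below runs base R's cor.test() on a small made-up pair of vectors and pulls out the estimate and confidence interval; the data and variable names are purely illustrative and are unrelated to the Prosper columns.

```r
# Made-up paired observations, for illustration only.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 1, 4, 3, 6, 5)

# Pearson's product-moment correlation test.
result <- cor.test(x, y, method = "pearson")

# Extract the correlation estimate and its 95% confidence interval.
r <- unname(result$estimate)
ci <- result$conf.int

print(r)
print(ci)
```

Reading the estimate together with its interval gives the strength and direction of the linear relationship; a very small p-value on a large sample does not by itself mean the relationship is strong.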

CreditRating and Term

The number of loan by term and credit rating

The 36-month loan is the most common choice for borrowers of every credit rating. Investors may want to focus their money on loans of this term.

CreditRating and IncomeRange

Most borrowers on Prosper have an income between $25,000 and $100,000, across ratings from AA to HR. Most borrowers with excellent ratings (B to AA) have incomes over $75,000, while most borrowers with poor ratings (D to HR) have incomes between $25,000 and $50,000.

Credit Rating and CreditScore

## 
##  Pearson's product-moment correlation
## 
## data:  pdf$ProsperRatingScore and pdf$CreditScoreRange
## t = 191.27, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.544155 0.553558
## sample estimates:
##       cor 
## 0.5488738

High credit scores go with the excellent rating (AA). Surprisingly, borrowers with the poor Prosper rating “HR” have higher credit scores than those rated E.

Credit Score and CurrentDelinquencies

## 
##  Pearson's product-moment correlation
## 
## data:  pdf$CurrentDelinquencies and pdf$CreditScoreRange
## t = -133.37, df = 113240, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.3734729 -0.3634055
## sample estimates:
##      cor 
## -0.36845

The delinquency count appears to be related to the credit score: borrowers with low credit scores are more likely to have many delinquencies.

Credit Rating, and Interest

## CreditRating: AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.04000 0.06990 0.07790 0.07912 0.08450 0.21000 
## -------------------------------------------------------- 
## CreditRating: A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0498  0.0990  0.1119  0.1129  0.1239  0.2150 
## -------------------------------------------------------- 
## CreditRating: B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0693  0.1414  0.1509  0.1545  0.1639  0.3500 
## -------------------------------------------------------- 
## CreditRating: C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0895  0.1765  0.1914  0.1944  0.2099  0.3500 
## -------------------------------------------------------- 
## CreditRating: D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1157  0.2287  0.2492  0.2464  0.2625  0.3500 
## -------------------------------------------------------- 
## CreditRating: E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1479  0.2712  0.2925  0.2933  0.3149  0.3600 
## -------------------------------------------------------- 
## CreditRating: HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1779  0.3134  0.3177  0.3173  0.3177  0.3600 
## -------------------------------------------------------- 
## CreditRating: NC
## NULL
## -------------------------------------------------------- 
## CreditRating: NA
## NULL
## 
##  Pearson's product-moment correlation
## 
## data:  pdf$ProsperRatingScore and pdf$BorrowerRate
## t = -917.37, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9537172 -0.9524846
## sample estimates:
##        cor 
## -0.9531049

The best rating, AA, carries the lowest risk and therefore the lowest interest rate. On the other hand, the poorest rating, HR, carries high risk and therefore requires a high interest rate.
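Per-group summaries in this layout can be produced with base R's by(). The sketch below uses a tiny made-up data frame with only three ratings; the values are invented to show the mechanics and are not taken from the Prosper data.

```r
# Made-up loans with a rating and a borrower rate, for illustration only.
loans <- data.frame(
  CreditRating = factor(c("AA", "AA", "A", "A", "HR", "HR"),
                        levels = c("AA", "A", "HR")),
  BorrowerRate = c(0.07, 0.08, 0.11, 0.12, 0.31, 0.32)
)

# Six-number summary of the rate within each rating.
rate_by_rating <- by(loans$BorrowerRate, loans$CreditRating, summary)
print(rate_by_rating)

# Median rate per rating, in factor-level order (best to worst).
median_rate <- tapply(loans$BorrowerRate, loans$CreditRating, median)
print(median_rate)
```

Setting the factor levels from best to worst rating keeps the groups in a meaningful order, so a rising median down the list matches the negative correlation reported above.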

Loan amount and Interest

## 
##  Pearson's product-moment correlation
## 
## data:  pdf$BorrowerRate and pdf$LoanOriginalAmount
## t = -117.58, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.3341283 -0.3237719
## sample estimates:
##        cor 
## -0.3289599

Small loans carry higher interest rates than larger loans; this may be related to the length of the loan.

Home Owner and Interest

Borrowers who are homeowners receive lower interest rates than those who do not own a home.

Now look at the investor side.

Investor and Term

Most investment on prosper.com is in 36- and 60-month terms.

Investor and Lender yield

## 
##  Pearson's product-moment correlation
## 
## data:  pdf$Investors and pdf$LenderYield
## t = -96.233, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2795354 -0.2687953
## sample estimates:
##        cor 
## -0.2741739

Few investors earn the highest yields, and the two variables are only weakly related: the correlation is negative, at -0.27.


Bivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset?

For borrowers, BorrowerRate depends on CreditRating, income, term, and loan amount. For investors, the return depends on BorrowerRate, term, and the amount they fund.

Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?

I also observed that borrowers who are homeowners have a lower average interest rate.

What was the strongest relationship you found?

The strongest relationship I found is between CreditRating and BorrowerRate (correlation of -0.95).


Multivariate Plots Section

The main objective of this section is to see how the relationships observed in the previous section hold up when more variables are considered together.

Category of Loan

The total loan amount has grown continuously over time; since 2013 in particular, the loan amount has grown by over 300%. Loans whose type is not available show roughly double the growth rate over time. Loans are used to improve quality of life rather than for entertainment. The top three loan types are Auto, Debt Consolidation, and Home Improvement.

Interest Amount Income Rating

Borrowers with more income have a better chance of raising their CreditRating and can borrow more money at a better interest rate.

Lender Yield for Investor by Term and Categories

The 36-month term gives investors a better investment yield.

Lender Yield by Rating

The lender yield clearly differs by credit rating. Borrowers with excellent ratings produce low returns, while borrowers with poor ratings produce high returns.

Investor yield by Term

This plot shows that since FY2011 most investors have had no loss on their investment. The most common investment terms are 36 and 60 months, and from FY2014 there is no investment in short-term loans.


Multivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

The most popular loan term is 36 months for all credit ratings. The lender yield gradually increases as the borrower's credit rating decreases.

Were there any interesting or surprising interactions between features?

What interested me is that BorrowerRate clearly differs across CreditRating levels.

OPTIONAL: Did you create any models with your dataset? Discuss the strengths and limitations of your model.

I did not create a model for my dataset.


Final Plots and Summary

Plot One

Loan status by percentage, from origination through Q1 2014

##             
##              2005 Q4  2006 Q1  2006 Q2  2006 Q3  2006 Q4  2007 Q1 
##   Cancelled  "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Chargedoff "  0.00" "  7.00" " 13.00" " 15.00" " 20.00" " 23.00"
##   Defaulted  "  0.00" " 20.00" " 22.00" " 25.00" " 23.00" " 18.00"
##   Past Due   "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Current    "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Completed  "100.00" " 73.00" " 65.00" " 60.00" " 57.00" " 58.00"
##             
##              2007 Q2  2007 Q3  2007 Q4  2008 Q1  2008 Q2  2008 Q3 
##   Cancelled  "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Chargedoff " 27.00" " 27.00" " 25.00" " 24.00" " 25.00" " 23.00"
##   Defaulted  " 14.00" " 12.00" " 11.00" " 10.00" "  9.00" "  9.00"
##   Past Due   "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Current    "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Completed  " 59.00" " 61.00" " 64.00" " 66.00" " 66.00" " 68.00"
##             
##              2008 Q4  2009 Q2  2009 Q3  2009 Q4  2010 Q1  2010 Q2 
##   Cancelled  "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Chargedoff " 20.00" " 15.00" "  9.00" " 12.00" " 12.00" " 14.00"
##   Defaulted  "  8.00" "  8.00" "  5.00" "  3.00" "  3.00" "  3.00"
##   Past Due   "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Current    "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Completed  " 72.00" " 77.00" " 86.00" " 84.00" " 85.00" " 83.00"
##             
##              2010 Q3  2010 Q4  2011 Q1  2011 Q2  2011 Q3  2011 Q4 
##   Cancelled  "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Chargedoff " 12.00" " 14.00" " 15.00" " 18.00" " 16.00" " 15.00"
##   Defaulted  "  3.00" "  3.00" "  3.00" "  4.00" "  3.00" "  3.00"
##   Past Due   "  0.00" "  0.00" "  1.00" "  3.00" "  2.00" "  4.00"
##   Current    "  0.00" "  1.00" " 10.00" " 28.00" " 33.00" " 36.00"
##   Completed  " 84.00" " 81.00" " 71.00" " 47.00" " 46.00" " 43.00"
##             
##              2012 Q1  2012 Q2  2012 Q3  2012 Q4  2013 Q1  2013 Q2 
##   Cancelled  "  0.00" "  0.00" "  0.00" "  0.00" "  0.00" "  0.00"
##   Chargedoff " 14.00" " 13.00" " 11.00" "  8.00" "  3.00" "  2.00"
##   Defaulted  "  3.00" "  2.00" "  2.00" "  1.00" "  0.00" "  0.00"
##   Past Due   "  3.00" "  4.00" "  5.00" "  5.00" "  4.00" "  3.00"
##   Current    " 46.00" " 51.00" " 56.00" " 63.00" " 73.00" " 85.00"
##   Completed  " 34.00" " 30.00" " 26.00" " 23.00" " 19.00" "  9.00"
##             
##              2013 Q3  2013 Q4  2014 Q1 
##   Cancelled  "  0.00" "  0.00" "  0.00"
##   Chargedoff "  0.00" "  0.00" "  0.00"
##   Defaulted  "  0.00" "  0.00" "  0.00"
##   Past Due   "  3.00" "  2.00" "  0.00"
##   Current    " 91.00" " 96.00" " 99.00"
##   Completed  "  6.00" "  3.00" "  1.00"

Description One

Demand for loans has grown rapidly year over year since FY2011. This plot shows the status of loans on Prosper.com; as of Q1 2014, 99% of the loans originated in that quarter are in “Current” status. In the early years, the share of bad loans (Cancelled, Chargedoff, Defaulted, and Past Due) looks too high, at roughly 30-40% during 2006-2008. It improved after 2009, and loan status is now very good: the share of bad loans has been 3% or less from Q3 2013 through now (Q1 2014). One interesting point is that the share of completed loans decreases for each later origination period, with only 1% completed for 2014; this is likely because recently originated loans have not yet reached the end of their term.
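A status-by-quarter percentage table like the one above can be built with table() and prop.table(). The sketch below is a minimal version on made-up rows; the data, column names, and rounding are assumptions for illustration rather than the code used for the plot.

```r
# Made-up loans with an origination quarter and a status.
loans <- data.frame(
  Quarter = c("2013 Q4", "2013 Q4", "2013 Q4", "2013 Q4",
              "2014 Q1", "2014 Q1"),
  Status = c("Current", "Current", "Current", "Completed",
             "Current", "Current")
)

# Counts of loans per status (rows) and quarter (columns).
counts <- table(loans$Status, loans$Quarter)

# Convert to column percentages: each quarter sums to 100.
pct <- round(100 * prop.table(counts, margin = 2))
print(pct)
```

Using margin = 2 normalizes within each quarter, which is what lets quarters with very different loan volumes be compared on the same scale.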

Plot Two

Summary of Interest rate by credit rating

## CreditRating: AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.04000 0.06990 0.07790 0.07912 0.08450 0.21000 
## -------------------------------------------------------- 
## CreditRating: A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0498  0.0990  0.1119  0.1129  0.1239  0.2150 
## -------------------------------------------------------- 
## CreditRating: B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0693  0.1414  0.1509  0.1545  0.1639  0.3500 
## -------------------------------------------------------- 
## CreditRating: C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0895  0.1765  0.1914  0.1944  0.2099  0.3500 
## -------------------------------------------------------- 
## CreditRating: D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1157  0.2287  0.2492  0.2464  0.2625  0.3500 
## -------------------------------------------------------- 
## CreditRating: E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1479  0.2712  0.2925  0.2933  0.3149  0.3600 
## -------------------------------------------------------- 
## CreditRating: HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1779  0.3134  0.3177  0.3173  0.3177  0.3600 
## -------------------------------------------------------- 
## CreditRating: NC
## NULL
## -------------------------------------------------------- 
## CreditRating: NA
## NULL

Summary of loan amount by credit rating

## CreditRating: AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    6000   10940   11580   16000   35000 
## -------------------------------------------------------- 
## CreditRating: A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5850   10000   11460   15000   35000 
## -------------------------------------------------------- 
## CreditRating: B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    6000   10000   11620   15000   35000 
## -------------------------------------------------------- 
## CreditRating: C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   10390   15000   25000 
## -------------------------------------------------------- 
## CreditRating: D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6100    7083   10000   15000 
## -------------------------------------------------------- 
## CreditRating: E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3600    4000    4586    5000   15900 
## -------------------------------------------------------- 
## CreditRating: HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3000    4000    3463    4000   16800 
## -------------------------------------------------------- 
## CreditRating: NC
## NULL
## -------------------------------------------------------- 
## CreditRating: NA
## NULL

Description Two

This chart shows the relationship between interest rates and loan amounts on www.prosper.com. Prosper offers interest rates of roughly 4-36% on rated loans, while loan amounts range from $1,000 to $35,000. The interest rate and loan amount depend on the borrower’s credit rating. Interest rates for borrowers with excellent credit are typically lower than for borrowers with poor credit. Borrowers with excellent to very good credit (AA to A) can receive interest rates of about 4-21% and can borrow $1,000-35,000. Borrowers with moderate to poor credit (B to HR) see interest rates of about 7-36%; those with moderate credit (B to C) can borrow $1,000-35,000, and those with poor credit (D to HR) about $1,000-17,000. However, the actual limit may also depend on other factors, such as the loan term and the loan type.

Plot Three

Description Three

This plot shows that loans on prosper.com generated impressive returns for investors from 2009 to 2012. The ROI for short-term 12-month loans was about 4-10%, for medium-term 36-month loans it was 10-40%, and long-term 60-month loans also produced very high returns of 20-60%. The return is much higher when the borrower has a poor credit rating. However, after 2013 the return on investment fell for all loan terms and all credit ratings, and in 2014 it declined sharply.


Reflection

The Prosper dataset has 113,937 records with 81 variables. I started by reading the documentation, selected 10-15 interesting variables, and planned to use various plots to check and explain the relationships between them. The difficulties I had with the data mainly stemmed from understanding the variables, asking interesting questions, and then selecting appropriate techniques for the analysis. From my analysis, I found that a borrower can get a lower interest rate by earning an excellent credit rating, which is associated with higher income, a lower debt ratio, or owning a home. For investors seeking high returns, it may be worth exploring investment in states such as Mississippi, Oregon, Iowa, Nebraska, North Dakota, Indiana, and Ohio.

To continue from here, I would like to make a predictive model to compare against the data. I am looking forward to learning more about modeling and predictions in the Machine Learning class.